Testing Probability Distributions Underlying Aggregated Data
Abstract
In this paper, we analyze and study a hybrid model for testing and learning probability distributions. Here, in addition to samples, the testing algorithm is provided with one of two different types of oracles to the unknown distribution D over [n]. More precisely, we define both the dual and cumulative dual access models, in which the algorithm A can both sample from D and, respectively, for any i ∈ [n]:
• query the probability mass D(i) (query access); or
• get the total mass of {1, ..., i}, i.e. $\sum_{j=1}^{i} D(j)$ (cumulative access).
These two models, by generalizing the previously studied sampling and query oracle models, allow us to bypass the strong lower bounds established for a number of problems in these settings, while capturing several interesting aspects of these problems, and providing new insight on the limitations of the models. Finally, we show that while the testing algorithms can be in most cases strictly more efficient, some tasks remain hard even with this additional power.
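The three kinds of access described in the abstract can be illustrated with a small sketch. This is not code from the paper: the class name `DualAccessOracle` and its methods `sample`, `query`, and `cumulative` are hypothetical names chosen here, and the distribution is assumed to be given explicitly as a list of probabilities over [n] = {1, ..., n}.

```python
import bisect
import itertools
import random


class DualAccessOracle:
    """Hypothetical oracle for a known distribution D over [n] = {1, ..., n}.

    Illustrative only: a real testing algorithm would see D solely through
    the three methods below, not through the underlying list.
    """

    def __init__(self, probs, seed=None):
        if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
            raise ValueError("probs must be non-negative and sum to 1")
        self.n = len(probs)
        self._probs = list(probs)
        # Prefix sums: _cdf[i - 1] holds D(1) + ... + D(i).
        self._cdf = list(itertools.accumulate(self._probs))
        self._rng = random.Random(seed)

    def _check(self, i):
        if not 1 <= i <= self.n:
            raise IndexError(f"i must lie in [1, {self.n}]")

    def sample(self):
        """Sampling access: return i in [n] with probability D(i)."""
        u = self._rng.random()
        # Index of the first prefix sum strictly greater than u;
        # the clamp guards against floating-point rounding at the top end.
        idx = bisect.bisect_right(self._cdf, u)
        return min(idx, self.n - 1) + 1

    def query(self, i):
        """Query access: return the probability mass D(i)."""
        self._check(i)
        return self._probs[i - 1]

    def cumulative(self, i):
        """Cumulative access: return the total mass of {1, ..., i}."""
        self._check(i)
        return self._cdf[i - 1]
```

In this sketch, the dual access model corresponds to an algorithm calling `sample` and `query`, and the cumulative dual access model to one calling `sample` and `cumulative`. The mass of an interval {a, ..., b} with a > 1 can be recovered from two cumulative queries as `cumulative(b) - cumulative(a - 1)`.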
Similar resources
Testing a Point Null Hypothesis against One-Sided for Non Regular and Exponential Families: The Reconcilability Condition to P-values and Posterior Probability
In this paper, the reconcilability between the P-value and the posterior probability in testing a point null hypothesis against the one-sided hypothesis is considered. Two essential families, non regular and exponential family of distributions, are studied. It was shown in a non regular family of distributions; in some cases, it is possible to find a prior distribution function under which P-va...
Comparing population distributions from bin-aggregated sample data: An application to historical height data from France.
We develop a methodology to estimate underlying (continuous) population distributions from bin-aggregated sample data through the estimation of the parameters of mixtures of distributions that allow for maximal parametric flexibility. The statistical approach we develop enables comparisons of the full distributions of height data from potential army conscripts across France's 88 departments for...
Probability Distribution Fitting to Maternal Mortality in Nigeria.
The consequences of Maternal Mortality (MM) cannot be overemphasized. It inhibits population growth resulting into loss of lives among others. This work tends to obtain the maternal mortality rates (MMR) in Nigeria, identify some fitted distributions to MMR and determine which of the distributions best fits the data. A comprehensive Exploratory Data Analysis (EDA) was carried on MM and the MMRs...
Statistical Analysis Design Including Biostatistics
1. The need for statistical data analysis 2. Principles of statistical analysis 2.1. Probability – The Foundation of Statistics 2.2. Basic Axioms of Probability Theory Based on Set Theory. 2.3. Types of Probability Distributions 2.4. Outcome and Expectation 2.5. Estimation and Statistical Inference 2.6. Estimators and their Distributions 3. Strategies for statistical data analysis 3.1. Hypothes...
Parameter Estimation and Cooperative Effects in Queueing Networks
This paper is devoted to probability-statistical analysis of Jackson opened and closed networks. A problem of an estimation of product limit distributions parameters using load coefficients of network nodes is solved. Cooperative effects in aggregated opened and closed networks are investigated and optimization procedures of their limit deterministic characteristics are constructed. Formulas of...